Option to disable automatic EFI partitioning. #3458

Closed

FransUrbo
Contributor

Sometimes it is desirable not to have 'zpool' set up partitioning on the devices it uses for the pool. So add a '-D' option to 'add', 'attach', 'create', 'replace' and 'split' to disable the automatic partitioning.

Signed-off-by: Turbo Fredriksson [email protected]
Closes #94
Closes #719
Closes #1162
Closes #3452

@FransUrbo
Contributor Author

I'm not 100% satisfied with this. I had to do API changes that I don't like. But we're not _1.0_ ready just yet, so…

#1162 mentioned -o whole_disk=1 which might not be so bad. I'll look into that.

@FransUrbo
Contributor Author

I've added a second commit, which uses -o whole_disk={on,off} instead of the -D option. When we've decided which way to go, we can either rebase them to one commit, or remove the second.

Either one seems fine, but I think I like the latter better. As luck would have it, the whole_disk property isn't shown in zpool get all, so...

@DeHackEd
Contributor

My concern is there is already a label field called whole_disk (see output of zdb) which is not directly related and may be confusing to users at first. This label field is used by ZFS as a hint that it's allowed to change the elevator, etc.

@FransUrbo
Contributor Author

@DeHackEd I noticed this quite late in the 'rewrite', so I figured I could just as well use that instead of the original wholedisk (one word).

Didn't figure it mattered.

@behlendorf
Contributor

@FransUrbo -o whole_disk (or -f wholedisk) is definitely the way to go here. It's exactly analogous to the way -o ashift is currently handled and means we can leverage the existing properties interface which was added some time ago. The downside is it will suffer from the exact same quirks we see with ashift. Specifically we're passing it as a global variable when it's really a per-vdev thing.

@DeHackEd actually these two things are directly related. They are one and the same.

@FransUrbo
Contributor Author

@behlendorf The fact that it has the same downside as ashift might actually be a good thing. I've been meaning to figure out a way to look up the parent's value and use that in the relevant commands.

@FransUrbo
Contributor Author

Ok, rebased as one commit using -o whole_disk={on,off}.

/*
* This is somewhat counter-intuitive.
* We've asked for whole disk, but we're
* setting it to false. This because
Contributor

s/This because/This is because/

Contributor Author

I think that sounds strange… Would "That's because" be better?

@ilovezfs
Contributor

@behlendorf @FransUrbo So any thoughts on "correct" behavior when the user explicitly sets -o whole_disk=on, and then proceeds to specify a partition?

@ilovezfs
Contributor

I'm a bit confused about usage here. Are we saying whole_disk=on means DO partition or DON'T partition? My understanding is that whole_disk=on should mean DO partition.

@FransUrbo
Contributor Author

The whole_disk means literally that - use the whole disk, don't do any partitioning.

But what to do if the user says:

zpool create -o whole_disk=on rpool /dev/sda1

I'm not sure...

@ilovezfs
Contributor

That's pretty much upside down of what I've always understood whole disk to mean in ZFS.

@FransUrbo
Contributor Author

Previously, whole_disk only meant "we own the device". I've here extended that to mean this in the literal sense - we own the device, and we want the whole device.

@ilovezfs
Contributor

I would highly recommend not turning this usage upside down.

@FransUrbo
Contributor Author

Same code in ZoL - https://github.com/zfsonlinux/zfs/blob/master/cmd/zpool/zpool_vdev.c#L526-L533.

And I don't think it IS up-side-down. I think it makes perfect sense. Whole disk means exactly that in both use cases.

I've clarified this somewhat:

                if (props != NULL) {
                        char *value = NULL;

                        if (nvlist_lookup_string(props,
                            zpool_prop_to_name(ZPOOL_PROP_WHOLE_DISK), &value)
                            == 0) {
                                /*
                                 * There are two meanings of 'whole_disk'.
                                 * 1. ZFS/ZoL 'owns' the device (it's not used
                                 *    by anyone else), so we are allowed to
                                 *    change the elevator etc.
                                 *
                                 * 2. The literal one, we want the whole disk,
                                 *    no partitions etc.
                                 */
                                if (strcmp(value, "on") == 0) {
                                        /*
                                         * This is somewhat counter-intuitive.
                                         * We've asked for whole disk, but we're
                                         * setting it to false. That's because
                                         * when it's false, later checks won't
                                         * partition the device, which is what
                                         * we ACTUALLY wanted.
                                         *
                                         * In this case, we meant point two,
                                         * hence setting this to false to avoid
                                         * the automatic partitioning.
                                         */
                                        wholedisk = B_FALSE;
                                } else {
                                        /*
                                         * whole_disk=off means we need to
                                         * figure out if we've specified the
                                         * 'raw' disk device (/dev/sda instead
                                         * of /dev/sda1 for example).
                                         * We still might own the device, but we
                                         * want to automatically partition the
                                         * device.
                                         *
                                         * This is point one above. IF we're
                                         * called with the raw device and not
                                         * with a device and partition.
                                        wholedisk = is_whole_disk(path);
                                }
                        }
                } else
                        wholedisk = is_whole_disk(path);

@FransUrbo
Contributor Author

The point is, I haven't really changed the behavior. I've extended it to mean the literal (obvious in my opinion) meaning.

If I previously specified whole_disk, but didn't GET the whole disk, I say THAT is broken because ZFS/ZoL partitioned the device and only used a part of it.

With this PR, I get the whole disk when I ask for it explicitly.

@FransUrbo
Contributor Author

Maybe I don't need to do anything about the counter-intuitive command line (specifying whole_disk=on AND a partition).

If the user says: zpool create -f -o whole_disk=on rpool /dev/sdc1, the latter will win, IF that partition exists. If it doesn't, then it will fail: cannot resolve path '/dev/sdc1'.

I guess I can fail if both are specified, but I think that's a little overzealous...

@ilovezfs
Contributor

If I'm specifying whole_disk=on, it's because I'm overriding what zfs would do otherwise.

zpool create tank sda

ZFS will by default decide sda is a whole disk and will set whole_disk=on. Because whole_disk=on, we will partition the device. Because whole_disk=on, we will chop off the partition from zpool status output. Because whole_disk=on, we will take the responsibility of repartitioning when using online -e in place.

So I intervene, and set -o whole_disk=off because I do not want the default. Now ZFS knows not to partition it. If it's not a whole disk, then there's no need to partition it.

Now suppose I had what might appear to be a whole disk but isn't really because it's virtual (loopback, zvols, /dev/ram, etc.). ZFS notices that even though the device name appears to be a whole disk, it's not really, so there's no reason to partition it.

But I disagree with that default decision. I want to force ZFS to believe this is a whole disk even though it's virtual because I want to use it as a VM boot drive. (Note that on Solaris, lofiadm does not support partitions.) On Linux and OS X, it is often quite "normal" to partition devices even if they're technically virtual. Therefore, I wish to override ZFS's default decision that the virtual device isn't really whole_disk=on material. So I force it, -o whole_disk=on. Now ZFS knows I expect it to partition it, and then use the resulting partition, as well as chop off the partition when displaying the device in zpool status, etc.

@FransUrbo
Contributor Author

If you say whole_disk=off you are basically saying you don't really want the whole disk but "do what you think is best" (i.e., partition it and use part of the disk).

If/when you say whole_disk=on, you mean exactly that - "I want the whole disk, don't even think about doing anything clever" (such as partition it etc).

Also, ZFS/ZoL hasn't really set whole_disk=on in your code example. The current behavior is just a note that "it" owns the device and has been clever about it. Now, with this PR, it means exactly what the word indicates - we want to use the whole disk, without any "clever" stuff.

@ilovezfs
Contributor

If you want the term to mean something else, use a different term. whole_disk=on precisely means DO try the "clever" stuff such as automatically partitioning it.

@FransUrbo
Contributor Author

No, you're misunderstanding the word whole. Whole means … entire, total :).

If you DON'T want the whole thing, then set it to off. If you DO want the whole shebang, then set it to on. This is logical and follows the English meaning of the word whole.

@ilovezfs
Contributor

openzfs ~ # zpool history o3xpool | grep 'zpool create'
2014-03-29.06:46:20 zpool create -o ashift=12 -O compression=lz4 -O casesensitivity=insensitive -O atime=off -O normalization=formD o3xpool -f /dev/disk/by-path/pci-0000:00:10.0-scsi-0:0:1:0
openzfs ~ # ls -l /dev/disk/by-path/pci-0000:00:10.0-scsi-0:0:1:0
lrwxrwxrwx 1 root root 9 May 30 11:37 /dev/disk/by-path/pci-0000:00:10.0-scsi-0:0:1:0 -> ../../sdb

openzfs ~ # zdb -l /dev/sdb1 | grep whole_disk | sort | uniq
whole_disk: 1
openzfs ~ # zpool status o3xpool | grep sd
sdb ONLINE 0 0 0

/dev/sdb1 is a whole_disk=on vdev.
It displays as sdb because whole_disk=on.
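
As a rough illustration of the display behaviour in the transcript above (not the libzfs code), the sketch below drops a trailing partition number only when the label's whole_disk flag is set. The function name vdev_display_name is made up for this example, and it is deliberately simplified: suffixes such as "-part1" or "p1" are not handled.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy `name` into `out`, dropping a trailing run of
 * digits when the label's whole_disk flag is non-zero. The result is always
 * NUL-terminated and truncated to fit `outlen` bytes.
 */
void
vdev_display_name(const char *name, int whole_disk, char *out, size_t outlen)
{
	size_t len = strlen(name);

	if (outlen == 0)
		return;

	if (whole_disk) {
		/* "sdb1" -> "sdb": hide the partition ZFS created itself. */
		while (len > 0 && isdigit((unsigned char)name[len - 1]))
			len--;
	}

	if (len >= outlen)
		len = outlen - 1;
	memcpy(out, name, len);
	out[len] = '\0';
}
```

With whole_disk set, "sdb1" would be shown as "sdb"; with it cleared, the name is passed through unchanged.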

@FransUrbo
Contributor Author

That's because whole_disk means that ZFS/ZoL owns the device. It has nothing to do with the fact that it partitioned the device.

This is still true with this PR. But the added meaning is, when you use it explicitly, you're meaning exactly that - use the whole disk, don't be "funny" about it.

@ilovezfs
Contributor

#3458 (comment)
"@DeHackEd actually these two things are directly related. They are one and the same."

@FransUrbo
Contributor Author

Yes, my point exactly which I've been trying to get you to understand for the last hour or so…

There IS no change in behavior. Whole disk STILL means "whole disk".

@ilovezfs
Contributor

We will clearly have to agree to disagree. I'd expect whole_disk=on to mean "please partition" AND to result in whole_disk=1 on the label of the just-birthed partition(s), and I'd expect whole_disk=off to mean "please don't partition" AND to result in whole_disk=0 on the label of whatever it is that was literally specified.
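
To make the disagreement concrete, here is a minimal sketch of the two readings as plain C predicates answering "will zpool partition this device?". It is an illustration only, not code from the PR: the enum wd_opt and both function names are invented here, and looks_like_whole_disk stands in for whatever is_whole_disk(path) would report.

```c
#include <stdbool.h>

/* Hypothetical tri-state for -o whole_disk as given on the command line. */
enum wd_opt { WD_UNSET, WD_OFF, WD_ON };

/*
 * Reading proposed in this PR: "on" means take the raw device untouched;
 * "off" or unset falls back to the existing detection.
 */
bool
partitions_under_pr_reading(enum wd_opt opt, bool looks_like_whole_disk)
{
	if (opt == WD_ON)
		return (false);
	return (looks_like_whole_disk);
}

/*
 * Alternative reading: "on" forces partitioning, "off" forbids it, and
 * only an unset option falls back to detection.
 */
bool
partitions_under_alt_reading(enum wd_opt opt, bool looks_like_whole_disk)
{
	if (opt == WD_ON)
		return (true);
	if (opt == WD_OFF)
		return (false);
	return (looks_like_whole_disk);
}
```

The two functions agree only when the option is unset; for an explicit "on" they give opposite answers, which is the whole dispute.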

@FransUrbo
Contributor Author

This discussion has shown a problem with the PR.

# ./cmd/zpool/zpool create -f -o whole_disk=on rpool sdc
# zdb | egrep 'path:|whole_disk'
            path: '/dev/sdc'
            whole_disk: 0
# 

The culprit here is is_whole_disk() (actually efi_alloc_and_init() which is_whole_disk() calls).

It assumes that we're expecting partitions on the device, and only if that is true, does it set whole_disk: 1.

I still think this is the correct, logical result following the rules of the English language. But it should in this case also set whole_disk=1 and I'm not sure how to resolve this.

Technically, I think it's wrong to call efi_alloc_and_init() in the first place - zdb doesn't know that zpool was called with -o whole_disk=on.

@FransUrbo
Contributor Author

@ilovezfs @DeHackEd I think I know now where you both went wrong. You both seem to believe that whole_disk means "partition the disk". It does [mean] nothing of the sort. It means "we own the device [period]". With a possible parenthesis "(do with it as we please)".

The fact that it DOES partition the device is only because of "old habit" (worst excuse ever if you ask me :) - that's what Solaris did. But these two are just coincidental. This is simply a part of the second part "do with it as we please".

What I've done here is clarify the use - whole_disk STILL means exactly that (when used by zdb). But if used by zpool it means literally that - "use the whole disk (don't do anything funny with it)".

This is, I'm sure, what @behlendorf meant with the comment "these two things are directly related. They are one and the same". If I misunderstood you @behlendorf, I'm sorry for putting words in your mouth.

I'll see if I can improve on the manpage(s) to make this more obvious, because if you two got it wrong, I bet more people would.

@DeHackEd
Contributor

I know what the current (label only) setting means. My only concern was that using "-o whole_disk" to mean something else could potentially be confusing to users at first when they see it - that they could force the whole_disk property on and ZFS might go ahead with its cache and elevator changes. Or how making it "settable only at pool creation time" (as opposed to vdev addition or replacement time) conflicts with expectations compared to how ashift works.

Then again I'm the kind of guy who would just abuse the device mapper to make a disk the way he wants anyway if this feature doesn't exist. :)

@FransUrbo
Contributor Author

You say that you know what the current label means, and yet you insist on misunderstanding it.

You're basically saying that "use whole disk" (whole_disk=on) should mean "partition the disk" and "don't use whole disk" (whole_disk=off) should mean "don't partition the disk, use the whole disk".

This is completely illogical and contrary to what the word whole means and indicates.

@FransUrbo
Contributor Author

Even the code agrees with me here:

 * By "whole disk" we mean an entire physical disk

@aarcane
aarcane commented May 30, 2015

Using the whole_disk property should have zero effect on the command in which it was issued. The primary use case is for an rpool. Create a handful of partitions (uefi, bios-grub, zfs, and partition 9), then run zpool create -o whole_disk=on rpool /dev/disk/by-partlabel/zfs. You've manually done any and all of the partitioning that should be done. You know better than zfs' detection algorithms. ZFS has the whole disk, it's just too stupid to figure it out.

@ilovezfs
Contributor

@aarcane yes -o whole_disk=on + a specific partition of your choice could make sense but it would require more changes, as the current code assumes we know which partition was anointed:
https://github.com/zfsonlinux/zfs/blob/65037d9b25c2bfa98d0aa5c9e34678127c03b345/lib/libzfs/libzfs_util.c#L901-L926

@FransUrbo
Contributor Author

Dang, it seems it wasn't as simple as just 'faking' the wholedisk = B_FALSE in make_leaf_vdev().

This is used later in the function (https://github.com/zfsonlinux/zfs/blob/master/cmd/zpool/zpool_vdev.c#L729-L730) to set this value in the nvlist. This in turn is then used in make_disks() to zero the label etc (https://github.com/zfsonlinux/zfs/blob/master/cmd/zpool/zpool_vdev.c#L1191-L1197).

And that seems to be the reason why zdb says whole_disk: 0, even though it technically should be 1 (if we assume that whole_disk means "we own the disk" and not "partition the disk") …
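
One way to picture the problem is that a single boolean is carrying two separate facts. The sketch below keeps them apart; it is purely hypothetical (struct leaf_intent and resolve_leaf_intent are invented for this illustration and are not what the PR does), and it assumes the label flag should follow "we own the device" while partitioning is decided separately.

```c
#include <stdbool.h>

/* Hypothetical: the two facts that 'wholedisk' currently conflates. */
struct leaf_intent {
	bool own_device;	/* value to record as whole_disk in the label */
	bool partition;		/* whether to write a partition table first */
};

/*
 * user_wants_raw:     the user explicitly asked for no partitioning.
 * path_is_whole_disk: stand-in for what is_whole_disk(path) reports.
 */
struct leaf_intent
resolve_leaf_intent(bool user_wants_raw, bool path_is_whole_disk)
{
	struct leaf_intent li;

	li.own_device = path_is_whole_disk;
	li.partition = path_is_whole_disk && !user_wants_raw;
	return (li);
}
```

Under these assumptions a raw-device request on sdc would record whole_disk: 1 in the label while still skipping the partitioning step.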

@FransUrbo
Contributor Author

I've pushed a new version where I moved the 'faking' of the whole_disk part later in the code. We need to allow for zpool create … sda (without path to the device). I've also reworded and improved on the code comments.

Still need to figure out how to cleanly and correctly make sure that the whole_disk is correct when retrieving later.

@micw
micw commented Aug 18, 2015

@FransUrbo, if you are going to rebuild the debian packages for #1350, please consider pulling this commit as well. It seems to be the only way to create partition-less (=auto-growable) pools...

@FransUrbo
Contributor Author

@FransUrbo, if you are going to rebuild the debian packages for #1350, please consider pulling this commit as well.

I only decided to include pull requests that aren't "dangerous" (in one way or the other) and this PR isn't quite finished. There are still a couple of things that we need to discuss and iron out.

@micw
micw commented Aug 18, 2015

BTW, here's my current workaround:

  • uninstall zfsonlinux, install zfs-fuse (alternatively use a chroot with mounted /dev and zfs-fuse)
  • create the pool on raw device using zfs-fuse
  • export the pool
  • switch back to zfsonlinux
  • import and upgrade the pool

@ryao

ryao commented Aug 30, 2015

We should not use whole_disk=1 to mean no partitioning because it should serve the function of allowing us to override the default choice of whole_disk in pool creation. We should have a different property, such as partition=raw to specify no partitioning. That way, we could also specify partition=standard for the default behavior and partition=gpt to force gpt. Future extensions could include partition=dos for dos partitioning and partition=vtoc for the partitioning scheme used by Solaris on SPARC. I suppose that specifying a non-default partition scheme should imply whole_disk=1, but whole_disk should be a separate property that can be toggled independently at pool creation (and maybe after pool creation too).

Also, we probably should think about being able to specify reserved space too for overprovisioning. That would be useful on SLOG devices. There is no need to tackle that alongside of this, but we should try to avoid implementing this in a way that makes it difficult to do that in the future. Going with whole_disk=1 (i.e. a boolean) to mean no partitioning is going to make implementing an overprovisioning switch in the command awkward in comparison to something extensible like partition=raw (i.e. a string).
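
A minimal sketch of how a string-valued partition property along these lines might be parsed. This is an assumption-laden illustration, not an existing zpool interface: the enum, its constants, and parse_partition_prop are all invented here, and only the values named above as near-term options (raw, standard, gpt) are accepted.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical partitioning schemes for a 'partition' pool property. */
enum part_scheme {
	PART_INVALID = -1,
	PART_STANDARD,	/* default behaviour, whatever zpool does today */
	PART_RAW,	/* no partitioning at all */
	PART_GPT	/* force a GPT label */
};

/*
 * Map the property string to a scheme. A missing value means the default.
 * Unknown strings (including future ones such as "dos" or "vtoc") are
 * rejected rather than guessed at.
 */
enum part_scheme
parse_partition_prop(const char *value)
{
	if (value == NULL || strcmp(value, "standard") == 0)
		return (PART_STANDARD);
	if (strcmp(value, "raw") == 0)
		return (PART_RAW);
	if (strcmp(value, "gpt") == 0)
		return (PART_GPT);
	return (PART_INVALID);
}
```

Because the property is a string rather than a boolean, further schemes or an overprovisioning setting could be added later as extra enum members without changing the command-line shape.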

Sometimes it is desired to not have 'zpool' set up partitioning on
devices it uses for the pool. So allow '-o whole_disk={on,off}'
option to 'add', 'attach', 'create', 'replace' and 'split' to
disable or enable, respectively, the automatic partitioning.

Signed-off-by: Turbo Fredriksson [email protected]
Closes openzfs#94
Closes openzfs#719
Closes openzfs#1162
Closes openzfs#3452
@behlendorf
Contributor

I think @ryao's suggestion of a new partition=raw|default|gpt property is the best way to handle this. The existing whole_disk value which is stored in the label is an internal detail and really not something which should be exposed to users.

If this pull request can be refreshed and a few test cases added we can definitely look in to making this change.

Contributor

@behlendorf behlendorf left a comment

This is functionality many people would clearly like to have but we need to settle on the best interface for controlling it.

Personally I like @ryao's suggestion of a new partition=<raw|default|gpt> property. The existing whole_disk value which is stored in the label is an internal detail and really not something which should be exposed to users.

As part of this work I think it would be great to have some way to control the size of those partitions and a test case for this behavior.

Better suggestions for an interface are welcome. If someone would like to work on this please go ahead and open a new PR.

@pepa65
pepa65 commented Jan 22, 2023

I think partition=raw is less clear than partition=none.
